Helping Term Sense Disambiguation with Active Learning
Abstract
Our research highlights the problem of term polysemy in terminometrics studies. Terminometrics is the measurement of term usage in specialized communication. Polysemy, especially among single-word terms as we will show, prevents the corpus frequencies of terms from serving as appropriate statistics for terminometrics. Automatic term sense disambiguation, a possible solution, requires human annotation to train a supervised learning algorithm. In our experiments, we show that terms, although polysemous, have a strong in-domain sense bias, which makes random sampling of annotation data less than optimal. We suggest using active learning and implement it within an annotation platform as a way of reducing annotation time.
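To illustrate why a strong sense bias makes random sampling inefficient, here is a minimal sketch of one common active learning strategy, least-confidence uncertainty sampling. The abstract does not say which selection strategy the annotation platform uses, so the strategy, the function name `least_confident`, and the probability values are all assumptions chosen for illustration.

```python
from typing import List, Sequence


def least_confident(probabilities: Sequence[Sequence[float]], k: int) -> List[int]:
    """Return the indices of the k occurrences the classifier is least sure about.

    Each row of `probabilities` holds a (hypothetical) classifier's sense
    probabilities for one occurrence of a term.
    """
    # Confidence of an occurrence = probability of its most likely sense.
    confidence = [max(row) for row in probabilities]
    # Rank occurrences from least to most confident; sorted() is stable,
    # so ties keep their original corpus order.
    ranked = sorted(range(len(probabilities)), key=lambda i: confidence[i])
    return ranked[:k]


# Illustrative (made-up) predictions for five occurrences of a two-sense term.
# Index 0 is the dominant in-domain sense, so most rows lean heavily toward it.
pool = [
    [0.97, 0.03],
    [0.55, 0.45],
    [0.92, 0.08],
    [0.40, 0.60],
    [0.99, 0.01],
]

# Ask the annotator to label only the two most uncertain occurrences.
selected = least_confident(pool, 2)
print(selected)
```

With a random sample, most annotated occurrences would fall on the dominant sense and add little information. Selecting the uncertain occurrences instead directs annotation effort toward the cases where the minority sense is plausible.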
Similar resources
Applying active learning to supervised word sense disambiguation in MEDLINE
OBJECTIVES: This study assessed whether active learning strategies can be integrated with supervised word sense disambiguation (WSD) methods, thus reducing the number of annotated samples while maintaining or improving the quality of disambiguation models. METHODS: We developed support vector machine (SVM) classifiers to disambiguate 197 ambiguous terms and abbreviations in the MSH WSD collec...
Learning a Stopping Criterion for Active Learning for Word Sense Disambiguation and Text Classification
In this paper, we address the problem of knowing when to stop the process of active learning. We propose a new statistical learning approach, called minimum expected error strategy, to defining a stopping criterion through estimation of the classifier’s expected error on future unlabeled examples in the active learning process. In experiments on active learning for word sense disambiguation and...
Domain Adaptation with Active Learning for Word Sense Disambiguation
When a word sense disambiguation (WSD) system is trained on one domain but applied to a different domain, a drop in accuracy is frequently observed. This highlights the importance of domain adaptation for word sense disambiguation. In this paper, we first show that an active learning approach can be successfully used to perform domain adaptation of WSD systems. Then, by using the predominant se...
Active Learning for Word Sense Disambiguation with Methods for Addressing the Class Imbalance Problem
In this paper, we analyze the effect of resampling techniques, including under-sampling and over-sampling, used in active learning for word sense disambiguation (WSD). Experimental results show that under-sampling causes negative effects on active learning, but over-sampling is a relatively good choice. To alleviate the within-class imbalance problem of over-sampling, we propose a bootstrap-based ...
Bringing Active Learning to Life
Active learning has been applied to different NLP tasks, with the aim of limiting the amount of time and cost for human annotation. Most studies on active learning have only simulated the annotation scenario, using prelabelled gold standard data. We present the first active learning experiment for Word Sense Disambiguation with human annotators in a realistic environment, using fine-grained sen...